GitHub: https://github.com/esthermy63/FinchMaPyoYang_ENV872_EDA_FinalProject.git

Rationale and Research Questions

As climate change continues to intensify, federal investment in energy and environmental programs has become an important part of how states manage climate risks and build resilience. However, it remains unclear what drives changes in this type of federal spending or why certain states receive more support than others. Understanding these patterns matters for environmental management because federal resources strongly affect a state’s ability to respond to climate impacts and plan for long-term sustainability.

Recent research also shows that climate risks may influence financial decision making. For example, Marshall et al. (2021) find that climate disasters such as hurricanes or floods increase public attention to environmental issues and lead to greater investment in green financial products. They describe this as a salience effect, meaning that when climate risks become more visible, people are more willing to direct their attention and money toward environmental solutions. This idea raises an important question for our study. If climate disasters shape private investment behavior, it is reasonable to ask whether they may also influence decisions about federal energy and environmental spending.

Public attitudes may also play a role. When voters express higher concern about climate change, this can create political pressure that encourages more federal investment in environmental programs. Together, these ideas suggest that both climate risks and public opinion could help explain changes in federal spending patterns.

For these reasons, our project examines federal energy and environment investments from 2021 to 2024. This period includes several major climate events, shifts in national climate policy, and rising public awareness of environmental issues. We aim to understand how federal spending has changed during these years and what factors may be associated with these changes across states.

Our research questions are:

  1. How does federal energy and environment investment change from 2021 to 2024?

  2. What factors affect total federal energy and environment spending?

  3. Where is federal energy and environment spending being invested (i.e. where does the money go)?

The datasets used in the study are listed below:

Environment and Energy Investment Data
Data Source: Portland State University, The Federal Energy and Environment Investment Project
Retrieved from: https://www.pdx.edu/policy-consensus-center/federal-energy-environment-investment-project
Variables Used: All federal grants for a state in FY 2021 to 2024; federal energy and environment grants for a state in FY 2021 to 2024
Date Range: 2021-2024

Demographic Data
Data Source: United States Census Bureau
Retrieved from: https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html
Variables Used: Total population, Male, Female, Median age, Median household income, Gas, Electricity, All other fuels, No fuel used, Year, Child (age 0-17)
Date Range: 2021-2024

Disaster Frequency
Data Source: OpenFEMA Dataset: Disaster Declarations Summaries - v2
Retrieved from: https://www.fema.gov/openfema-data-page/disaster-declarations-summaries-v2
Variables Used: Disasters
Date Range: 2021-2024

Public Awareness on Climate Change
Data Source: Yale Program on Climate Change Communication
Retrieved from: https://climatecommunication.yale.edu/visualizations-data/ycom-us/
Variables Used: Estimated % of adults who are somewhat or very worried about global warming
Date Range: 2021-2024

Spatial Data
Data Source: United States Census Bureau
Retrieved from: https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_20m.zip
Variables Used: State, Geometry
Date Range: 2018

Dataset Information

We selected 10 variables of interest from these data sources, covering federal investment, demographic characteristics, household heating fuel, disaster frequency, and public concern about climate change. Each annual dataset was processed and summarized in R. For every year, the resulting dataset contains 51 observations (one row per state, plus D.C.) and 12 columns (the 10 variables, plus state name and year).

The 12 columns are:

Column Name | Unit | Description
Total_EE_Investment | $ | Federal investment (grants and loans) in energy and environment
Political Score | % | % of adults who are somewhat or very worried about global warming
Disasters | Count | Number of declared disasters
Total Population | People | -
Gender Ratio | % | Female population / total population
Child | % | % of people aged 0-17
Median Age | Age | -
Median Household Income | $ | -
Gas | % | % of people who use gas as a heating fuel
Electricity | % | % of people who use electricity as a heating fuel
State | - | -
Year | - | -

Below is an example summary output for the 2021 dataset. For further analysis, the datasets from each year were bound by rows, producing a final dataset of 204 observations (51 per year over four years) of the 10 variables.

Summary Table of 2021 Dataset
Variable | Mean | Standard Deviation | Max | Min
Total_EE_Investment | 3075957471.9 | 3129531289.2 | 1.786936e+10 | 632529051.0
Political_Score | 70.9 | 3.9 | 79.8 | 62.3
Disasters | 53.9 | 118.7 | 554.0 | 1.0
Total population | 6507720.5 | 7397954.7 | 3.923784e+07 | 578803.0
Female | 0.5 | 0.0 | 0.5 | 0.5
child | 0.2 | 0.0 | 0.3 | 0.2
Median age (years) | 39.1 | 2.2 | 44.7 | 31.8
Median household income (dollars) | 69243.8 | 11314.4 | 90203.0 | 48716.0
Gas | 0.5 | 0.2 | 0.8 | 0.1
Electricity | 0.4 | 0.2 | 0.9 | 0.1

Summary Table of Final Dataset
Variable | Mean | Standard Deviation | Max | Min
Total_EE_Investment | 3499803975.6 | 3637178194.5 | 2.488060e+10 | 632529051.0
Political_Score | 72.1 | 4.3 | 83.7 | 61.8
Disasters | 39.8 | 74.9 | 554.0 | 1.0
Total population | 6569643.1 | 7416319.5 | 3.943126e+07 | 578803.0
Female | 0.5 | 0.0 | 0.5 | 0.5
child | 0.2 | 0.0 | 0.3 | 0.2
Median age (years) | 39.3 | 2.2 | 45.1 | 31.8
Median household income (dollars) | 75686.5 | 13076.2 | 109707.0 | 48716.0
Gas | 0.5 | 0.2 | 0.8 | 0.0
Electricity | 0.4 | 0.2 | 0.9 | 0.1
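
The row-binding step described in this section can be sketched in base R as follows. This is a minimal illustration, not the project's actual code: the helper `make_year()`, the underscore-style column names, and the placeholder values are hypothetical, and only the shape (51 rows per year, 12 columns) follows the description above.

```r
# Hypothetical stand-in for one processed annual dataset:
# 51 rows (50 states plus D.C.) and 12 columns, filled with placeholders.
make_year <- function(year) {
  vars <- c("Total_EE_Investment", "Political_Score", "Disasters",
            "Total_population", "Female", "child", "Median_age",
            "Median_household_income", "Gas", "Electricity")
  df <- data.frame(State = paste0("State_", 1:51), Year = year,
                   stringsAsFactors = FALSE)
  for (v in vars) df[[v]] <- NA_real_   # placeholders, not real data
  df
}

# Bind the four annual datasets by rows into one stacked dataset.
yearly   <- lapply(2021:2024, make_year)
combined <- do.call(rbind, yearly)

dim(combined)  # number of rows and columns in the stacked dataset
```

Stacking by rows keeps one observation per state and year, which is the layout a pooled regression with a year factor expects.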

Exploratory Analysis

Prior to the regression analysis, we conducted an exploratory analysis of the distributions of the key variables. Figure 1 indicates that both federal investment amounts and disaster frequencies are strongly right-skewed, with a few states having exceptionally high values compared to the majority. This skewness can violate the assumptions of Ordinary Least Squares (OLS) regression, potentially leading to heteroscedasticity and unreliable coefficient estimates. To address this, we applied a logarithmic transformation to the investment (Total_EE_Investment), disaster (Disasters), and population (Total population) variables to stabilize the variance and make the statistical models more robust.
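
The effect of a log transformation on skewness can be illustrated with a short base R sketch. The values below are hypothetical (a doubling sequence chosen to be strongly right-skewed) and the `skewness()` helper is defined here for illustration; neither comes from the project's data or code.

```r
# Hypothetical right-skewed values (each value doubles the previous one).
investment <- 2^(0:10)

# Sample skewness: mean of the cubed standardized values.
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

log_investment <- log(investment)

skew_raw <- skewness(investment)      # clearly positive (right-skewed)
skew_log <- skewness(log_investment)  # essentially zero (symmetric)
c(raw = skew_raw, log = skew_log)
```

Because the logged values are evenly spaced, their skewness is essentially zero, while the raw values have a long right tail. Real data will not be this clean, but the direction of the change is the same.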

[figure 1: Scatter Plot]

As part of our exploration of the factors driving investment, we looked at where the most natural disasters occur. If climate disasters influence investment behavior, we might expect the states with the highest frequency of disasters to also receive the greatest federal investment in the energy and environment sectors.

The state boundary data use the NAD83 geographic coordinate reference system (EPSG:4269).

[figure 2: Frequency of Natural Disasters (2021-2024)]

The map shows that Texas (n=699), Florida (n=660), and Louisiana (n=650) had the most declared disasters from 2021 to 2024. Declared disasters appear to be concentrated in the southern United States.

Next, we looked at the states that received the most federal grants and loans towards energy and environment.

[figure 3: Total Energy and Environment Investment Map by States (2021-2024)]

The highest federal energy and environment investments from 2021 to 2024 were in California, New York, and Texas.

State | Federal energy and environment spending
California | $82,488,010,951
New York | $54,029,205,539
Texas | $45,365,375,638

California, New York, and Texas are among the most populous states and are therefore expected to receive more federal investment. To account for this, we next looked at energy and environment spending as a ratio of total federal spending in each state.

[figure 4: Ratio of Energy and Environment Investment to Total State Budget Map (2021-2024)]

States with the highest ratio of spending towards energy and environment programs are DC, Wyoming, South Dakota, and Montana.

State | Ratio of federal energy and environment spending
District of Columbia | 0.14
Wyoming | 0.13
South Dakota | 0.12
Montana | 0.11
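
The ratio used for this ranking can be sketched in base R. The three states and their dollar amounts below are made up for illustration, and the column names `EE_grants`, `All_grants`, and `EE_ratio` are hypothetical rather than taken from the project's code.

```r
# Hypothetical totals for three made-up states (not the project's data).
grants <- data.frame(
  State      = c("A", "B", "C"),
  EE_grants  = c(10, 30, 5),     # energy and environment grants
  All_grants = c(100, 600, 40),  # all federal grants
  stringsAsFactors = FALSE
)

# Share of each state's federal grants that goes to energy and environment.
grants$EE_ratio <- grants$EE_grants / grants$All_grants

# Rank states from the highest to the lowest share.
ranked <- grants[order(-grants$EE_ratio), ]
ranked[, c("State", "EE_ratio")]
```

In this toy example state B receives the most energy and environment dollars but ranks last by share, which mirrors why small states can top the ratio ranking while large states top the totals.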

Analysis

Question 1: How does federal energy and environment investment change from 2021 to 2024?

[figure 5: ARIMA forecast (5 years) - National EE Spending (2021-2024)]

Characteristic | Beta | SE
Year | 8,170,150,801* | 2,877,088,748
*p<0.05; **p<0.01; ***p<0.001
Abbreviations: CI = Confidence Interval, SE = Standard Error

Federal energy and environment investment shows a significant upward trend from 2021 to 2024. A linear regression of annual national spending against time indicates that investment increased by approximately $8.17 billion per year (p = 0.0118), demonstrating a clear positive temporal relationship. This finding is reinforced by a Mann–Kendall trend test (tau = 0.503, z = 2.88, p = 0.00399), which confirms a statistically significant monotonic increase in investment over the period. Together, these results provide strong evidence that federal spending in the energy and environment sectors has grown substantially and consistently across the 2021–2024 window. Forecasts from an ARIMA model suggest that investment levels are likely to remain high in upcoming years, continuing the upward trajectory.
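
The two trend checks described above can be sketched in base R. The spending values are hypothetical, and `cor.test()` with `method = "kendall"` is used here as a base R stand-in: it computes Kendall's tau between the series and time, the rank statistic underlying the Mann-Kendall test. The package actually used in the analysis is not stated in this report.

```r
# Hypothetical annual totals in billions of dollars (illustrative only).
spending <- data.frame(
  Year  = 2021:2024,
  Total = c(150, 160, 165, 180)
)

# Linear trend: the slope is the average change in spending per year.
trend_fit <- lm(Total ~ Year, data = spending)
slope <- unname(coef(trend_fit)["Year"])

# Kendall rank correlation between spending and time.
mk  <- cor.test(spending$Total, spending$Year, method = "kendall")
tau <- unname(mk$estimate)

c(slope = slope, tau = tau)
```

The slope measures the size of the trend in dollars per year, while tau only measures whether the series rises consistently, so the two results answer slightly different questions.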

Question 2: What factors affect total federal energy and environment spending?

This section investigates the drivers of federal energy and environmental investment across U.S. states in 2021, specifically examining the roles of environmental necessity (natural disasters) and political demand (public opinion). To establish a baseline, we first conducted a multivariable regression analysis to identify which socio-economic factors, beyond population alone, influence federal investment. We used AIC-based model selection to refine the model from income, gender, age, and fuel types to the most statistically relevant drivers.

Characteristic | Beta | SE
log_pop | 0.75*** | 0.043
Female | -7.4 | 5.45
child | -11** | 4.07
Median age (years) | -0.10** | 0.034
Median household income (dollars) | 0.00 | 0.000
Gas | -0.05 | 0.374
Electricity | -0.49 | 0.383
*p<0.05; **p<0.01; ***p<0.001
Abbreviations: CI = Confidence Interval, SE = Standard Error

As expected, population was the most powerful predictor (t=17.33), explaining the vast majority of the variance in investment. This confirms that any subsequent analysis of disasters or politics must control for population. Interestingly, Median Household Income was found to be statistically insignificant and was removed during the AIC process. This suggests that in 2021, federal energy and environmental funding was distributed relatively independently of a state’s wealth. Electricity (the proportion of homes heated by electricity) and the age-related variables (Median age, child) emerged as significant or marginally significant predictors.
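
AIC-based selection of this kind can be sketched with `step()` from base R. The data below are simulated, the variable names `income`, `age`, and `log_inv` are hypothetical, and backward elimination is an assumption: the report does not state which search direction was used.

```r
# Simulated data: log_pop and age drive the response; income does not.
set.seed(42)
n <- 51
sim <- data.frame(
  log_pop = rnorm(n, 15, 1),
  income  = rnorm(n, 70, 10),  # unrelated to the response by construction
  age     = rnorm(n, 39, 2)
)
sim$log_inv <- 10 + 0.75 * sim$log_pop - 0.1 * sim$age + rnorm(n, sd = 0.3)

# Start from the full model and drop terms while AIC keeps improving.
full     <- lm(log_inv ~ log_pop + income + age, data = sim)
selected <- step(full, direction = "backward", trace = 0)

formula(selected)  # terms retained by the AIC search
```

AIC trades goodness of fit against the number of parameters, so a variable can be dropped even when its coefficient is not exactly zero, and a retained variable is not guaranteed to be statistically significant.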

Characteristic | Beta | SE
log_disaster | 0.24*** | 0.067
*p<0.05; **p<0.01; ***p<0.001
Abbreviations: CI = Confidence Interval, SE = Standard Error

[figure 7: Correlation between Natural Disaster Frequency and Energy Investment]

In Model 2, we tested whether disaster frequency alone predicts federal investment. The results indicated a statistically significant positive relationship (t = 3.616, p < 0.001; 40 observations, with 11 dropped due to missing values). Specifically, the coefficient of 0.24 suggests that a 1% increase in disaster frequency is associated with approximately a 0.24% increase in federal investment. Furthermore, the model yielded an R-squared of 0.256, meaning that disaster frequency alone explains about 25.6% of the variance in funding allocation. While these initial findings support the hypothesis that funding is reactive to environmental needs, the low R-squared value also indicates that nearly 75% of the variation remains unexplained. This limitation, combined with the lack of demographic controls, raises the critical question of whether this correlation is truly causal or merely a reflection of a population effect.
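
The elasticity reading of a log-log coefficient can be checked with a few lines of base R. The helper `pct_change()` is hypothetical and written only for this illustration; the coefficient 0.24 is the one reported for log_disaster above.

```r
beta <- 0.24  # log-log coefficient reported for log_disaster

# Percent change in the outcome for a given percent change in the predictor,
# holding the fitted log-log relationship fixed.
pct_change <- function(beta, pct_x) ((1 + pct_x / 100)^beta - 1) * 100

pct_change(beta, 1)    # effect of a 1% increase in disaster frequency
pct_change(beta, 100)  # effect of doubling disaster frequency
```

For small changes the percent effect is close to the coefficient itself, which is why 0.24 reads as roughly 0.24% per 1%. For large changes the approximation breaks down: doubling disaster frequency corresponds to about an 18% increase, not 24%.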

Characteristic | Beta | SE
Political_Score (`2021`) | 0.06* | 0.026
*p<0.05; **p<0.01; ***p<0.001
Abbreviations: CI = Confidence Interval, SE = Standard Error

[figure 8: Public Awareness as a Determinant of Energy and Environmental Investment]

Similarly, Model 3 examined the relationship between public opinion on climate change and federal investment. The results indicated a significant positive relationship (coefficient = 0.065, p = 0.018, R-squared = 0.108), suggesting that states with higher climate awareness tend to attract more funding. Because investment is log-transformed while the opinion measure is a percentage, the coefficient implies that a one-percentage-point increase in the share of adults worried about global warming is associated with roughly a 6 to 7% increase in federal funds. This supports the proactive funding hypothesis, where investment follows the demand for climate policy.

However, these models were limited by potential confounding factors and a small sample size. To address this, we constructed a Final Integrated Model using data pooled across 2021 to 2024.

Question 3: Where is federal energy and environment spending being invested (i.e. where does the money go)?

The Final Integrated Model is crucial because it tests both drivers simultaneously while controlling for population. As the final result indicates, once we control for population, the effect of disaster frequency disappears, while the political effect remains. This suggests that political will is an independent driver, whereas the apparent disaster effect is largely a function of state population.

Characteristic | Beta | SE
log_disaster | 0.01 | 0.022
Political_Score | 0.01* | 0.007
log_pop | 0.66*** | 0.028
as.factor(Year) | |
    2021 (reference) | - | -
    2022 | -0.01 | 0.070
    2023 | 0.12 | 0.068
    2024 | 0.24*** | 0.067
*p<0.05; **p<0.01; ***p<0.001
Abbreviations: CI = Confidence Interval, SE = Standard Error

When analyzing the final merged data set (2021-2024) and controlling for population size, the influence of disaster frequency (log_disaster) became statistically insignificant (p = 0.61). This contrasts with the prior analysis and suggests that the previously observed correlation between disasters and investment was largely driven by state population.

However, political pressure on environmental issues (Political_Score) remained a significant predictor (p < 0.05) even after controlling for population. This implies that states with a higher percentage of voters concerned about climate change attract more federal investment, independent of their population.
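
The confounding pattern described here can be reproduced with simulated data. In the sketch below, all values are simulated rather than taken from the project: disasters are generated to scale with population and to have no direct effect on investment, and the effect sizes are assumptions chosen only to loosely resemble the reported table. The model formula mirrors the terms shown in the Final Integrated Model.

```r
# Simulated state-by-year panel (51 states x 4 years); not the project's data.
set.seed(872)
n_states <- 51
panel <- expand.grid(State = seq_len(n_states), Year = 2021:2024)

log_pop_state <- rnorm(n_states, 15, 1)
panel$log_pop <- log_pop_state[panel$State]

# Disasters scale with population but have NO direct effect on investment.
panel$log_disaster <- 3 + 0.8 * (panel$log_pop - 15) +
  rnorm(nrow(panel), sd = 0.5)
panel$Political_Score <- rnorm(nrow(panel), 72, 4)
panel$log_inv <- 11 + 0.66 * panel$log_pop +
  0.01 * panel$Political_Score + rnorm(nrow(panel), sd = 0.3)

# Bivariate model versus pooled model with population and year controls.
naive  <- lm(log_inv ~ log_disaster, data = panel)
pooled <- lm(log_inv ~ log_disaster + Political_Score + log_pop +
               as.factor(Year), data = panel)

b_naive  <- coef(naive)[["log_disaster"]]
b_pooled <- coef(pooled)[["log_disaster"]]
round(c(naive = b_naive, pooled = b_pooled), 2)
```

In the simulation the bivariate disaster coefficient is clearly positive even though disasters have no direct effect, and it shrinks toward zero once population enters the model. This shows that the pattern in the report is what population confounding would produce, although it does not by itself rule out other explanations.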

Summary and Conclusions

This study investigated the drivers of federal energy and environmental investment across U.S. states from 2021 to 2024 by integrating socio-economic data, disaster frequencies, and public opinion.

Question 1: How does federal energy and environment investment change from 2021 to 2024?

Our temporal analysis confirms a statistically significant increase in federal investment over the four-year period. The time series trend shows a consistent monotonic rise in spending, culminating in a notable surge in 2024. This trajectory reflects a structural expansion in federal support for energy and environmental initiatives, likely driven by the implementation of major policy frameworks.

Question 2: What factors affect total federal energy and environment spending?

The study establishes that population size is the dominant predictor of funding allocation, while median household income was not a significant predictor. This suggests that federal funds are distributed broadly in proportion to the number of residents, rather than favoring wealthier states or targeting lower-income areas specifically based on income metrics.

Question 3: Where is federal energy and environment spending being invested (i.e. where does the money go)?

While initial bivariate analyses suggested that funding increases with natural disaster frequency, our final integrated model revealed this to be a spurious correlation driven by state size. Once population was controlled for, the influence of disasters became statistically insignificant. This implies that federal funding follows major population centers, which naturally experience higher aggregate disaster counts, rather than responding purely to the frequency of events. In contrast, political pressure (public awareness of climate change) remained a significant positive predictor even after controlling for population. This indicates that political will acts as an independent driver.

In summary, the allocation of federal energy and environmental funds from 2021 to 2024 is best characterized as demographically scaled but politically responsive. While the foundational volume of investment is determined by population size, the strategic variations above this baseline are driven by the civic demand for climate action, rather than a direct reactive response to natural disaster frequency.

References

Marshall, B. R., Nguyen, H. T., Nguyen, N. H., Visaltanachoti, N., & Young, M. (2021). Do climate risks matter for green investment? Journal of International Financial Markets, Institutions & Money, 75, 101438. https://doi.org/10.1016/j.intfin.2021.101438